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Abstract 

Can a dynamical system paint masterpieces such as Da Vinci’s Mona Lisa or 
Monet’s Water Lilies? Moreover, can this dynamical system be chaotic in the sense 
that although the trajectories are sensitive to initial conditions, the same painting 
is created every time? Setting aside the creative aspect of painting a picture, in 
this work, we develop a novel algorithm to reproduce paintings and photographs. 
Combining ideas from ergodic theory and control theory, we construct a chaotic 
dynamical system with predetermined statistical properties. If one makes the spa¬ 
tial distribution of colors in the picture the target distribution, akin to a human, 
the algorithm first captures large scale features and then goes on to refine small 
scale features. Beyond reproducing paintings, this approach is expected to have a 
wide variety of applications such as uncertainty quantification, sampling for efficient 
inference in scalable machine learning for big data, and developing effective strate¬ 
gies for search and rescue. In particular, our preliminary studies demonstrate that 
this algorithm provides significant acceleration and higher accuracy than competing 
methods for Markov Chain Monte Carlo (MCMC). 

Is it possible to design a dynamical system that paints like a human? Given the 
availability of efficient modern printing technologies, this may seem like a trivial problem. 
However, the manner in which a modern printer prints is fundamentally different when 
compared to a human painting a picture. Roughly speaking, a printer scans each pixel 
in a pre-determined order and, using a color palette, deposits the appropriate amount of 
ink for each pixel. In comparison, given the same color palette, a human paints by first 
capturing the high level (or large scale) features of the picture and then goes on to fill in 
the low level (or detailed) features. In this paper, we are not attempting to model or mimic 
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the intelligence and creativity of humans when perceiving and painting pictures. Rather, 
our objective is to design an algorithm that reproduces the human actions of painting by 
first capturing the large scale features followed by small scale details. Our algorithm is 
based on the construction of a deterministic (no randomness or noise) dynamical system 
(described by a set of governing differential equations), which visits states with frequencies 
prescribed by a user defined distribution. 

Our approach is related to the theory of ergodicity. Ergodic systems are dynamical 
systems with the property that time averages of functions along trajectories are equal to 
spatial averages; the associated statistical distributions are known as invariant measures 
of the system [T]. In this work, we construct an ergodic dynamical system where one 
can prescribe the statistical distributions of the underlying dynamics. By prescribing the 
color distributions as target distributions for the ergodic system, individual trajectories 
for each color (the empirical distributions) converge to the desired invariant distributions, 
thus tracing out the original painting or picture. 

The underlying dynamical system is chaotic in the sense that it exhibits sensitivity to 
initial conditions (as shown in the supplementary material, the system has three Lyapunov 
exponents > 0), but nonetheless leads to robust recreation of the picture irrespective of 
initial condition. Note that positive Lyapunov exponents are the primary characteristic 
of chaos M. This algorithm can potentially be used to drive a robot that reproduces 
paintings/pictures. The ramifications of this approach extend beyond the applicability of 
designing robotic systems that can paint BSE!- The challenging task of efficient sampling 
of complex probability distributions lies at the heart of a wide range of problems. For ex¬ 
ample, sampling probability distributions is one of the most important tasks in statistical 
inference and machine learning, and is typically achieved by Markov Chain Monte Carlo 
(MCMC) methods PH, 151 [ffl HU], Variational methods [TT , IT2] are a popular alternative 
for statistical inference that rely on the construction of bounds on the likelihood function 
that may not always be tight m- In this work, we restrict ourselves to MCMC based 
sampling approaches. MCMC methods are often plagued by slow mixing [2], partic¬ 
ularly when distributions are complex and multi-modal. We believe that our approach 
presents an exciting alternative for sampling complex distributions for Bayesian inference 
and machine learning in the big data setting. In the supplementary material accompa¬ 
nying this manuscript, we present comparisons of Metropolis-Hastings [T5], Hamiltonian 
MCMC |i6j 37], and slice sampling [18] with chaotic sampling. We find that our ap- 
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proach provides higher accuracy and faster computation than all three methods in low 
dimensions. Additionally, the equations in our approach are easy to construct and can be 
evolved using Euler or Runge-Kutta integration schemes. 

Beyond machine learning and statistical inference, the ability to design dynamical 
systems with desired properties has a multitude of practical applications. With the emer¬ 
gence of 3-D printing [19] as an approach for quick prototyping of new mechanical parts, 
the task of achieving desired material distribution becomes increasingly important in 
aerospace and automobile applications. We imagine that our algorithm can potentially 
be used to print non-trivial parts, with desired space averaged material properties. This 
can be achieved by precisely controlling the continuous motion of the printing head using 
a dynamical system whose trajectory has the desired time averaged statistical properties. 
Furthermore, we envision that our approach will be used to design coordinated robotic 
systems that mimic biological swarms for information collection [20] (where the statisti¬ 
cal time averaged distributions of the trajectories matches the distributions of expected 
information), construct bio-molecules and associated models with desired statistical be¬ 
havior or conformations |2H E2], and build ergodic micro-mixers with optimal mixing 
properties [23]. The task of analyzing and developing stem cells with appropriate prop¬ 
erties of proliferation and differentiation can be modeled using dynamical systems [23]. 
Thus, one can potentially use the construction of appropriate dynamical systems to con¬ 
trol gene expressions m and endow stem cells with the desired statistical proliferation 
and differentiation properties. 

Mathematically, the ‘painting’ problem described above reduces to the problem of 
designing a continuous time dynamical system whose states sample an arbitrarily complex 
probability distribution defined over the state space. Leveraging the algorithms described 
in [23], we construct a dynamical system that ‘paints’ the desired picture. The approach 
works by decomposing the given picture into its red, green, and blue components, denoted 
by fi R , /i G , and fi B respectively. Note that each one of the /r’s captures the spatial 
distribution of the corresponding color over the entire picture. We then construct three 
dynamical systems, one for each one of the color distributions, such that the states of 
these dynamical systems, denoted by x R , x G , and x B , correspond to the two dimensional 
locations for the red, green, and blue paintbrushes respectively. 

To recreate paintings, we would like to design trajectories such that x R is ergodic with 
respect to the distribution fi R , x° is ergodic with respect to the distribution ji G and so 
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on. In other words, we would like to construct ergodic dynamical systems [261 EZ] with 
invariant measures given by p R , p G , and p B . Thus, for each color we construct a separate 
dynamical system, as described below. To sample a given probability distribution /i, we 
use the notion of a coverage distribution whose support is the set of points in the state 
space that have already been visited by the generated trajectory. This is defined as: 

C(P) = \ $x(T)(p)dr, ( 1 ) 

where p is a point in M 2 , f is the time, 6 is the Dirac delta function, and x is the trajectory 
for a single color. We would now like the distributions C to “weakly” converge to the 
distributions /i as f —> oo. The difference between the coverage distribution C and p, is 
denoted by 0(f), and is defined as, 


<t> 2 {f) = \\C - h\\ 2 h - 3/ 2 , (2) 

where f/~ 3 / 2 denotes the negative Sobolev space norm that captures how close C and 
H are in a “weak” sense. This is equivalent to minimizing the difference between the 
weighted Fourier expansions of both C and //. Thus, the metric 0(f) on a two dimensional 
rectangular domain is computed using, 


0 2 (f) 

where, 

Ck(t) 
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l 
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{C, f k ) , and /i fe 


fk(x,y) = 
(h) fk) ) 


h) 


/k x 7rx . .kyiry^ 

cos(——)cos(——), 

J_J X J-Jy 


(3) 

(4) 

(5) 


and we take x — [x,y\, k — [k x , k y ] is the corresponding wave-number vector, [L x , L y \ are 
the dimensions of the painting or picture, and < .,. > denotes the standard inner product 
between functions. By minimizing an appropriate function of 0(f) [25]], one essentially 
forces the Fourier coefficients of C (denoted by c*,) to converge to the Fourier coefficients of 
H (denoted by fik), but with greater importance given to the large-scale modes (captured 
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by Afc). The dynamical system that achieves this minimization is, 


where 



~dfk dfk 
dx ’ dy 


2 


( 6 ) 

(7) 


and [B x (t), B y (t )] = ^ A k t(c k (t) - n k ) 


( 8 ) 


k 


Here u max is the maximum speed of the paintbrushes. A k serves as a weighting factor 
that gives greater importance to the large scale features than the finer ones. Of course, 
the number of Fourier terms has to be truncated to a fixed value K = K x x K y , where we 
assume that the maximum value for k x is K x and k y is K y . The higher the value of K, the 
more detailed are the features of the reproduced painting. Additionally, since C —> ji in a 
“weak” sense as t —> oo, the longer one runs the simulation the “closer” is the reproduced 
image to the original. Thus, the reader may be tempted to pick large values for K and T 
(time for simulation), however, it is important to note that computational burden scales 
as 0(Plog(P) + KT), where P is the number of pixels. Note that the 0(Plog(P)) 
complexity arises due to the fast Fourier transform of p in Eq. [5j For more information 
about the underlying mathematical theory and extensions to higher dimensions, we refer 
the reader to the supplementary material. Also, note that the resulting dynamical system 
described in Eqs (|6]{8]) is chaotic (please refer to the supplementary material). 

The main steps of our algorithm can be summarized as follows: 

1. The original image is decomposed into its red, green, and blue components yielding 
y R , p G , and y B respectively. 

2. Fourier coefficients (based on a preselected value of K ) of the three color distribu¬ 
tions obtained in the previous step are computed. 

3. For each one of the colors, the dynamical system described by Eqs (J6]|8]) is evolved 
for a prescribed amount of time T. The fraction of time that each trajectory spends 
in a pixel determines the intensity of the corresponding color for that pixel. 

To demonstrate our algorithm we use Leonardo Da Vinci’s iconic painting - the Mona 
Lisa. Figure [l] shows the evolution of the Mona Lisa as generated by chaotic sampling. 
The trajectories of the red, green, and blue paintbrushes are shown in Figure [TJ Here 


5 





the red, green, and blue images are generated by a single trajectory for each color, this 
is akin to a continuous motion for each paintbrush that eventually produces the desired 
distribution of color. The picture is reproduced by just three individual trajectories, one 
corresponding to each color. As the computation progresses, the Mona Lisa image emerges 
on the superimposition of the three trajectories. Note that to evolve the equations in [6j 
we use a simple explicit Euler scheme with a step size of dt = 1CU 3 . The associated movie 
(Movie SI) displays the remarkable evolution of the reproduction of the Mona Lisa. 

Figure [2] shows various pictures and paintings along with the corresponding reproduc¬ 
tions by chaotic sampling. As seen in the cases of Big Sur and Pines Switz, even though 
the original pictures are real photographs, the reproductions by our dynamical system 
look surprisingly similar to human paintings. For more information, we refer the reader 
to Figs. S4-S6 and Movies S2-S4 in the supplementary material. 

Note that a host of Markov Chain Monte Carlo (MCMC) techniques for sampling 
distributions have been developed over the years - particularly for Bayesian inference 
and machine learning [9j TO]. In fact, MCMC methods are a critical step in various 
statistical and machine learning approaches; thus, these methods form the basis of a 
very active research community. We compare our chaotic sampling methodology (for 
[K x ,K y ] = [80,80]) with Metropolis-Hastings [15], [28], Hamiltonian MCMC [IT] and slice 
sampling [18] (popular methods for sampling distributions in a wide variety of applica¬ 
tions). We use all three methods to sample a multi-modal distribution in two dimensions 
and find that in comparison to these methods, chaotic sampling provides higher accu¬ 
racy with faster speeds of computation (see supplementary material for further details). 
Note that chaotic sampling is not based on constructing Markov chains and in this way 
is fundamentally different from traditional approaches. Unlike traditional Markov chains 
that are based on the last sampled point, successive points in chaotic sampling are picked 
based on the entire history of trajectories (see supplementary material). 

Our exposition in the text restricts chaotic sampling to two dimensions, however, 
there is no such restriction. One can construct a dynamical system to sample probability 
distributions in any dimension d. The general, d- dimensional, formulation of chaotic 
sampling is discussed in the supplementary material. However, the size of the underlying 
dynamical system in chaotic sampling explodes as K X1 x K X2 x ... K Xd , where K x . is the 
wave number in each direction. Our current efforts are focused around addressing this 
undesirable scaling of the chaotic sampling methodology. 
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Conclusions 


In this work, we have developed an algorithmic approach to construct dynamical systems 
with prescribed statistical properties. We demonstrate that our approach can be used 
to design chaotic dynamical systems that reproduce paintings and photographs. Akin to 
a human painter, the dynamical system first captures the large scale features and then 
fills in the finer details. The given picture is decomposed into its color components, thus 
yielding distributions of red, green, and blue (or equivalently cyan, magenta, yellow, and 
key) colors. The algorithm then constructs a separate dynamical system for each color 
that optimally samples the corresponding color distribution. In a robotic system, these 
dynamical systems will provide the instructions for each paintbrush with associated colors. 
These dynamical systems statistically sample the prescribed distributions, consequently, 
the results are independent of initial conditions. The resulting equations for chaotic 
sampling are shown (in the supplementary material) to have three positive Lyapunov 
exponents, implying sensitive dependence to initial conditions, a key property of chaotic 
systems [2|. The paintings are the “attractors” for this dynamical system. 

Additionally, in our supplementary material, we investigate the utility of the chaotic 
sampling approach for machine learning and Bayesian inference in big data settings. In 
particular, we demonstrate significant gains (in accuracy and convergence time) over tra¬ 
ditional MCMC [9] methods. We compare chaotic sampling to slice sampling, Hamilto¬ 
nian MCMC, and Metropolis-Hastings on a multi-modal test example in two dimensions. 
Our approach has the advantage over Metropolis-Hastings that it does not require the 
construction of proposal distributions. The construction of these distributions can be 
challenging. Moreover, given that the problem of designing systems with prescribed sta¬ 
tistical properties arises in numerous applications such as 3-D printing [T9] , biological 
systems pH E2] and microfluidics [23], we anticipate our approach will also be valuable 
in these scenarios. 
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(a) Time = 0.016 sec 



(b) Time = 0.046 sec 



(c) Time = 0.076 sec 



(d) Time = 0.106 sec 



(e) Time = 0.151 sec 

Figure 1: Evolving reproduction of the Mona Lisa as recreated by chaotic sampling. The 
first frame is the superposition of the red, green, and blue frames. Note that the red, 
green, and blue frames are composed of a single trajectory for each color evolving over 
time. ]_o 









(a) Original Big Sur Photograph 



(c) Original Pines Switz Photograph 



(e) Original Starry Night by Van Gogh 



(b) Big Sur reproduced using chaotic sam¬ 
pling 



(d) Pines Switz reproduced using chaotic 
sampling 



(f) Starry Night reproduced using chaotic 
sampling 


Figure 2: Various pictures and paintings generated by chaotic sampling. 
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Supplementary Text 

Ergodic Trajectory Generation 

In this section, we construct dynamical systems that sample prescribed distributions. Let 
the dynamical system trajectory be denoted by x(t) G U C Note that in the case 
of paintings d — 2 (as paintings and pictures are two dimensional). To keep track of the 
points already visited by the dynamical system, we define the coverage distribution (at 
location p e M d ), generated by the trajectory x as, 

C\p) = l [ 5g( T )(p)dT, (9) 

t Jo 

where t is the time and 8 is the Dirac delta measure. For the coverage distribution defined 
above, let us compute its spherical integrals given as: 


d(p,r) = (C,XB(p,r)) = I C(z)dz. (10) 

B(p,r) 

where B(p, r ) = {z : \\z — p\\ < r} and XB(p,r) is the indicator function on the set B(p, r). 
The resulting spherical integral d(p,r) has a very useful interpretation; d(p,r) is the 
fraction of time spent by the trajectory in the set B(p,r). 

The coverage metric we use is a distance between the coverage distribution and a 
probability distribution /i. Here /i represents the significance (or importance) that should 
be given to each individual point in the domain. In other words, let /j be the target 
distribution defined over a region U C M d . Consider a distance given by comparing the 
differences in the spherical integrals of C and /j: 

E 2 {t) = j J (d(p,r) — n(B(p,r))) 2 dpdr. (11) 

° u 

Also consider the distance given by the Sobolev space norm which can be expressed as, 


y(«) = lic(-)-M.)ilb>«= £ a*m«) - w .| 2 , 


where A k = - - +1)/2 , c k (t) = ( C , f k ) and p k = (p, f k ). 

(i + Flr) 


Here {f k } are the d-dimensional Fourier basis functions that satisfy Neumann boundary 
conditions on the domain U, k — [k xi , k X2 ,..., k Xd ] is the corresponding wave-number 
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vector which belongs to 7L* d = [0,1,2,.. ,] d . F° r instance, on a rectangular domain U = 
[0 ,L X ] x [0 ,L y \, x = [x,y\, and k = [k x ,k y \. This gives, 

„ . . 1 .k x 7TX. ,k v THJ. , . , 

fk(x, y) = — cos( ——) cos( y —), where £ 7 , k y = 0,1, 2.... (13) 

ilk -Lsy 

where hk is the normalization constant. The two metrics E and (j) described above are 
equivalent, i.e., there exist bounded constants ci,c 2 such that, 

Ci4> 2 < E 2 < c 2 0 2 . (14) 


For more details on the relationship of these metrics to concepts in ergodic theory, we 
refer the reader to [25] . Since the metric cj) is easier to compute, we use <f> for control 
design. We aim to design a control such that —y 0. This ensures that the distribution 

C(.) —* ju(.) as t —> oo. 

Assume that the control u(t) describes the sampling trajectory for the system, 


x{t) = u(t). 


(15) 


Let us define the following vector for each dynamical system: 

Bit) = k s k {t)V fk{x{t)). 

k 


(16) 


Here, Sk = t(ck(t) — Hk) and V fk(x(t)) is the gradient of the Fourier basis function 
evaluated at state x(t). The coverage control is given as: 


u{t) 


m 

\\ m \\2 


where u max is the maximum rate of evolution of each dynamical system. 

Thus, the overall equations (for each color or dynamical system) are given by, 


(17) 


x(t) = u(t), 
u(t) = u ma 


m 


W)l | 2 

B(t) = y^A fcgfc(t)V/fc(f(t)), 

k 

I^k = (Mj fk)i 

Ck = ( C,fk ) = 7 / fk{x{r))dT. 

t ./n 


(18) 
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The paintings and pictures are reproduced by evolving the above equations in two 
dimensions and setting the color distributions to fi. The trajectories for each color are then 
superimposed to produce the overall painting or picture. Note that the computation time 
for sampling a single /j, can potentially be accelerated by computing, in parallel, multiple 
trajectories starting from different initial conditions and superimposing the results. 


Lyapunov Exponents of the Dynamical System 


One of the key signatures of chaos is the sensitive dependence of the underlying dynamics 
to initial conditions [2], The Lyapunov spectrum [2,3] is defined as, 


A* 


lim — 

T-+ oo T 


log( 


A j(T) 

Aj(0) 


(19) 


where Aj(0) is the initial perturbation of the trajectory along the i-th principal axis. 
Similarly, A*(T) is the divergence of the trajectory from the original trajectory, at time 
T, along the i-th principal axis. Consider a two dimensional (x = [x,y]) rectangular 
region of dimensions [L x , L y \ with a uniform prior (equivalent to a picture with identical 
pixels). To compute the Lyapunov exponents we use the approach outlined in [3]. The 
resulting equations are given below, 


x{t) 

y{t) 

Skit) 

where, 

B x {t) 

By (t) 


B, 


-U r 


Ur\ 


'Bt + Bl ’ 


B,, 


b* + b*' 


fk f^ki 


y ^ A/jSfc (t) 
k 

~y ^ A/jSfc it) 

k 


df k jx,y) 

dx 

dfkjx,y) 

dy 


( 20 ) 


where fk are defined in Eqn. [T3j Note that for accurate computation of Lyapunov ex¬ 
ponents, the equations for the entries of the Jacobian are typically included [3]. For the 
above equations, one can derive analytical expressions for the entries of the Jacobian, 
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given by, 


- dx 

dx 

dx 

~ 

dx 

dy 

ds k 


dy 

dy 

dy 


dx 

dy 

9s k 


ds k 

ds k 

ds k 


- dx 

dy 

ds k 



where, 

dx _ D [B y B' x - B x B' y \ 
«(•)“ “ maX * (B* + B*) § 

p [-Sx-B^ --B1/-B*] 

«(•) 1 + 

ds k _ df k 

dx dx ’ 
ds k _ df k 

dy dy ’ 


( 21 ) 


( 22 ) 


Thus, by analytically calculating B x and B y with respect to x, y and Sk, one can compute 
the dynamics of the Jacobian. The dimensionality of this system of equations depends 
on the number of wave functions that are included in the expansion. In particular, the 
dynamics of x, y and Sk give rise to M — K x K y + 2 equations, where K x and K y are the 
maximum values for k x and k y respectively. Consequently, the dynamics of the Jacobian 
is determined by M 2 equations. 

Using the above equations along with the approach outlined in [3], we can compute 
the dynamics of the entire spectrum of Lyapunov exponents (as shown in Fig. SI). It 
can be seen that the system has three positive Lyapunov exponents, where the largest 
exponent has an asympototic value of ~ 0.1. Also, note that the information dimension 

N 

of the attractor [3] is not defined since A * > 0. 

i=l 

Thus, the dynamical system is chaotic since it displays sensitive dependence to initial 
conditions. Note, however, that the final statistical distribution is invariant and indepen¬ 
dent of initial conditions. 
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Comparison of Chaotic sampling with Metropolis-Hastings, Hamil¬ 
tonian MCMC and Slice Sampling 

In this section, we investigate the use of chaotic sampling to sample distributions for 
machine learning and statistical applications [10]. Markov Chain Monte Carlo (MCMC) 
methods are used extensively in the areas of machine learning and Bayesian statistics [8,9], 
computational physics, and rare event sampling. Additionally, these methods are exten¬ 
sively used in the big data setting. In fact, MCMC methods are often a critical step in 
various statistical and machine learning approaches; thus, these methods form the basis 
of a very active research community. We compare our chaotic sampling methodology 
with Metropolis-Hastings [15], Hamiltonian MCMC [17] and slice sampling [18] (popular 
methods for sampling distributions in a wide variety of applications). We use all three 
methods to sample a multi-modal distribution in two dimensions and find that compared 
to competing methods, chaotic sampling provides higher accuracy with faster speeds of 
computation. As mentioned previously, chaotic sampling is not based on constructing 
Markov chains and in this way is fundamentally different from traditional methods. Un¬ 
like traditional Markov chains that are based on the last sampled point, successive points 
in chaotic sampling are picked based on the entire history of the trajectories. We now 
give a brief description of Metropolis-Hastings, Hamiltonian MCMC, and slice sampling, 
and present comparisons. 

Metropolis-Hastings is a popular Markov Chain Monte Carlo approach that proceeds 
by generating samples using a proposal distribution that are then accepted or rejected 
based on the target distribution [28]. Thus, Metropolis-Hastings requires the tuning of 
a proposal distribution, Q(xt+i\xt), that proposes the new state x t +i based on x t . Once 
the new point x t +i is generated using the proposal distribution, it is accepted or rejected 
using threshold criteria on the target distribution /J,(x). The construction of the proposal 
distribution can be particularly challenging [15], and requires much trial and error. For 
our approach, we pick, 

Q(x t+ i\x t ) = J\f(x t ,a), (23) 

where A f is the Gaussian distribution with the arguments of the mean and standard 
deviation. We find that a standard deviation of cr = 1 (in both x and y directions) works 
best for the multi-modal example described later in this section. 

The Hamiltonian MCMC [16,17] is an approach for sampling distributions that is 
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based on the imposition of a Hamiltonian structure on the underlying dynamics. Hamil¬ 
tonian MCMC proceeds by considering the original variables or states as the “position 1 ' 
variables and appending “momentum” variables. The distribution of momentum variables 
is typically assumed to be Gaussian [16]. One then simulates a Markov chain by resam¬ 
pling the momentum variables and then performing Metropolis updates on the position 
variables. Note that proposal states in Hamiltonian MCMC are not generated by an ex¬ 
plicit proposal distribution, but by the momentum variable updates. The evolution of the 
dynamics of the Hamiltonian system is typically performed using symplectic integrators 
[16]. The advantage of Hamiltonian MCMC over Metropolis-Hustings is that correlation 
between successive samples is avoided by using a Hamiltonian structure on the under¬ 
lying states [16]. The disadvantages are that the target distribution of the momentum 
variables can be difficult to design and the number of states have to be doubled due to 
the Hamiltonian structure (since one must introduce momentum variables). 

Slice sampling [18] is based on uniformly sampling the graph of a density function. 
This is achieved by alternatively sampling (using uniform distributions) the state and 
probability spaces. Essentially, one uniformly samples points from the vertical interval 
defined by the density of the current point, followed by uniform sampling of the union of 
intervals that constitute the horizontal “slice” of the density function. The advantages of 
slice sampling are that one does not require any parameter or proposal density selection 
and ease of implementation. Our results on slice sampling are based on the default 
implementation in the MATLAB software package. 

To compare our chaotic sampling approach with Metropolis-Hastings [15], Hamiltonian 
MCMC [17], and slice sampling [18], we pick a multi-modal distribution by normalizing 
the sum of three Gaussian distributions in two dimensions. The first Gaussian is centered 
at (—2.0, —2.0), the second Gaussian at (2.0, 2.0), and the third Gaussian at (—2.0, 2.0). 
All the Gaussian distributions have a standard deviation of 0.5 and a correlation p = 0 
(see Figure S2). Thus, the probability distribution is given by, 
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The advantage of picking a distribution of this form is that the mean and higher 
moments can be computed analytically. For example, the mean of p(x,y) is given by 
m x = mx i +m p +mx 3 an d my — n, yi+ m m+ m v3 _ We generate 8000 samples using the follow¬ 
ing methods: chaotic sampling, Metropolis-Hastings (with the proposal density given in 
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Eqn. 23), Hamiltonian MCMC, and slice sampling, treating the distribution in Eqn. 24 


(Figure S2) as the target distribution. We then compute the statistical error of the var¬ 
ious approaches (with respect to the analytical closed form solutions) as function of the 
number of samples. 

We can compare the moments of any observable on the x = (x, y ) space. For simplicity, 
we choose E(x), note however, that any complicated integral can be used for comparison. 
The results are averaged over 10 trials (with 8000 samples each) and the convergence of 
the error in the predicted mean of x is shown in Figure S3. 

Note that for chaotic sampling we pick [K x , K y ] = [80, 80] and dt — 0.1 for the explicit 
Euler integration scheme. We generate the samples using the dynamical system described 


by Eqns. 15 and [17] We first deterministically run the dynamical system and then, to 
penalize points on the trajectory that lie between the peaks (since they only serve to 
connect regions of high probability), we reject the points on the trajectory where y(x, y) 
lies below a threshold (we pick 0.05 in this case). Note that this point rejection step is 
performed after the generation of all the points, and has complexity 0([^). 

The comparison of all the methods is presented in Figure S3. The figure shows the 
error in the estimate of the mean of the two dimensional multi-modal distribution as a 
function of the number of samples (the results are averaged over 10 independent runs). It 
is clear in Figure S3 that the chaotic sampling method converges significantly faster than 
Metropolis-Hastings, Hamiltonian MCMC, and slice sampling. Additionally, we find that 
for the three peak example, chaotic sampling approach is computationally 1.8X faster 
than Metropolis-Hastings, 3X faster than slice sampling, and 10X faster than Hamilto¬ 
nian MCMC. Furthermore, note that the chaotic sampling approach does not require the 
construction of a proposal distribution that can be complicated [15]. 

Further analysis of chaotic sampling is required in the context of Markov Chain Monte 
Carlo (MCMC) sampling. Primarily, we aim to address the requirement of computing 
Fourier integrals for chaotic sampling as well as constructing “sparse” representations 
of the probability distributions to reduce the number of coefficients in high dimensions. 
Additionally, we are also investigating the development of “hybrid” methods that first use 
traditional sampling approaches to obtain rough estimates of the Fourier integrals and 
then switch over to chaotic sampling. 
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Test Runs of Chaotic Sampling on various Pictures and Paintings 

In addition to the Mona Lisa simulations, we present the evolution of reproductions of 
various pictures and paintings using chaotic sampling. All simulations were run using 
explicit Euler integration with dt = 10~ 3 and [K x ,K y ] = [100,100]. Here we present 
time snapshots of the reproductions of Pines Switz (Figure S4) and Big Sur (Figure 
S5) photographs, and the Starry Night painting (Figure S6). These results, just as in 
the Mona Lisa example, are produced by computing a single trajectory for each color. 
These trajectories are superimposed to produce the overall picture. Additionally, the 
corresponding time evolution of the reproduction of all the paintings and pictures are 
captured in Movies S1-S4. One can clearly see the emergence of the pictures and paintings 
as the computation progresses. 



Fig. SI: A plot of the positive Lyapunov exponents vs. simulation time. The inset 
figure zooms into the interval t = [600,1000]. It can be clearly seen that three Lyapunov 
exponents are positive, thus implying chaos. 
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Fig. S2: The multi-modal distribution used in the comparison of chaotic sampling, 
Metropolis-Hastings, Hamiltonian MCMC, and slice sampling. 



Fig. S3: Comparison of convergence of chaotic sampling, Metropolis-Hastings, Hamilto¬ 
nian MCMC, and slice sampling. 
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(a) Time = 0.016 sec 



(b) Time = 0.031 sec 



(c) Time = 0.061 sec 



(d) Time = 0.091 sec 



(e) Time = 0.151 sec 


Fig. S4: Evolving reproduction of the Pines Switz photograph as recreated by chaotic 
sampling. The Erst frame is the superposition of the red, green, and blue frames. Note 
that the red, green, and blue frames are composed of a single trajectory for each color 
evolving over time. 
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(a) Time = 0.016 sec 



(b) Time = 0.031 sec 



(c) Time = 0.061 sec 



(d) Time = 0.091 sec 



(e) Time = 0.151 sec 


Fig. S5: Evolving reproduction of the Big Sur photograph as recreated by chaotic 
sampling. The first frame is the superposition of the red, green, and blue frames. Note 
that the red, green, and blue frames are composed of a single trajectory for each color 
evolving over time. 
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(a) Time = 0.016 sec 



(b) Time = 0.031 sec 



(c) Time = 0.061 sec 



(d) Time = 0.091 sec 



(e) Time = 0.151 sec 


Fig. S6: Evolving reproduction of the Starry Night painting as recreated by chaotic 
sampling. The first frame is the superposition of the red, green, and blue frames. Note 
that the red, green, and blue frames are composed of a single trajectory for each color 
evolving over time. 
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Movie SI 

The movie shows the chaotic sampling based reproduction of the Mona Lisa painting. 
The first frame is the superposition of the red, green, and blue frames. Note that the red, 
green, and blue frames are composed of a single trajectory for each color evolving over 
time. 

Movie S2 

The movie shows the chaotic sampling based reproduction of the Pines Switz photograph. 
The first frame is the superposition of the red, green, and blue frames. Note that the red, 
green, and blue frames are composed of a single trajectory for each color evolving over 
time. 

Movie S3 

The movie shows the chaotic sampling based reproduction of the Big Sur photograph. 
The first frame is the superposition of the red, green, and blue frames. Note that the red, 
green, and blue frames are composed of a single trajectory for each color evolving over 
time. 

Movie S4 

The movie shows the chaotic sampling based reproduction of the Starry Night painting. 
The first frame is the superposition of the red, green, and blue frames. Note that the red, 
green, and blue frames are composed of a single trajectory for each color evolving over 
time. 
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